Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks
نویسندگان
چکیده
منابع مشابه
Video Summarization with Attention-Based Encoder-Decoder Networks
This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames, the output is a keyshot sequence. Our key idea is to learn a deep summarization network with attention mechanism to mimic the way of selecting the keyshots of human. To this end, we propose a novel video summariz...
متن کاملAttention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks
With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNN) are considered particularly well suited for modeling sensory and streaming data. When predicting future behavior, incorporating information from neighboring sensor stations is often beneficial. We propose a new RNN based architecture for c...
متن کاملImplicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model
Neural machine translation has shown very promising results lately. Most NMT models follow the encoder-decoder framework. To make encoder-decoder models more flexible, attention mechanism was introduced to machine translation and also other tasks like speech recognition and image captioning. We observe that the quality of translation by attention-based encoder-decoder can be significantly damag...
متن کاملDecoupling Encoder and Decoder Networks for Abstractive Document Summarization
Abstractive document summarization seeks to automatically generate a summary for a document, based on some abstract “understanding” of the original document. State-of-the-art techniques traditionally use attentive encoder–decoder architectures. However, due to the large number of parameters in these models, they require large training datasets and long training times. In this paper, we propose ...
متن کاملPixel-wise Ear Detection with Convolutional Encoder-Decoder Networks
Object detection and segmentation represents the basis for many tasks in computer and machine vision. In biometric recognition systems the detection of the region-of-interest (ROI) is one of the most crucial steps in the overall processing pipeline, significantly impacting the performance of the entire recognition system. Existing approaches to ear detection, for example, are commonly susceptib...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: IEEE Transactions on Multimedia
سال: 2015
ISSN: 1520-9210,1941-0077
DOI: 10.1109/tmm.2015.2477044